Cross talk between bacterial and human gene networks enriched using ncRNAs in IBD disease

Inflammatory bowel disease (IBD) is a long-term, immune-mediated inflammatory illness of the gut with several extra-intestinal complications. The aims of this study were to develop a novel network-based meta-analysis approach based on combinations of the differentially expressed genes (DEGs) from microarray data, to enrich the functional modules from human protein–protein interaction (PPI) and gene ontology (GO) data, and to profile the ncRNAs acting on the genes involved in IBD. The gene expression profiles of GSE126124, GSE87473, GSE75214, and GSE95095 were obtained from the Gene Expression Omnibus (GEO) database based on the study criteria between 2017 and 2022. The DEGs were screened using R software and were then used to examine gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways. The ncRNAs, including miRNAs and ceRNAs, were predicted on the PPIs visualized using Cytoscape. Enrichment analysis of the differentially expressed genes (n = 342) using KEGG and GO showed that the signaling pathways related to Staphylococcus aureus and pertussis bacterial infections may stimulate the immune system and exacerbate IBD via interaction with human proteins including fibrinogen gamma chain (FGG), keratin 10 (KRT10), and Toll-like receptor 4 (TLR4). By building a ceRNA network, the lncRNAs XIST and NEAT1 were identified as affecting the common miRNAs hsa-miR-6875-5p, hsa-miR-1908-5p, hsa-miR-186-5p, hsa-miR-6763-5p, hsa-miR-4436a, and hsa-miR-520a-5p. Additionally, the chromosome regions including NM_001039703 and NM_006267, which produce the most potent circRNAs, play a significant role in the ceRNA network of IBD. We also predicted the siRNAs that would be most effective against the bacterial genes in Staphylococcus aureus and pertussis infections.
These findings suggest that three genes (FGG, KRT10, and TLR4), six miRNAs (hsa-miR-6875-5p, hsa-miR-1908-5p, hsa-miR-186-5p, hsa-miR-4436a, hsa-miR-520a-5p, and hsa-miR-6763-5p), two lncRNAs (XIST and NEAT1), and chromosomal regions including NM_001039703 and NM_006267, which produce the most effective circRNAs, are involved in the ncRNA-associated ceRNA network of IBD. These ncRNA profiles are related to the described gene functions and may serve as therapeutic targets in controlling inflammatory bowel disease.


Materials and methods
Microarray data. The following search terms were used to identify and download the mRNA microarray expression profile datasets from the GEO database (http://www.ncbi.nlm.nih.gov/geo): inflammatory bowel disease, IBD, and Homo sapiens (organism). The datasets were filtered to those obtained with the microarray technique, with sample size > 50, gene size > 15,000, and no interventional agents, between 2017 and 2022. The finalized microarray datasets for the meta-analysis were GSE126124 (GPL6244) 26, GSE87473 (GPL13158) 27, GSE75214 (GPL6244) 28, and GSE95095 (GPL14951) 29. Only samples from each dataset that met the following criteria were retained for the analysis: (I) The GSMs had to be from colon and intestinal tissues (not blood). (II) The tissue had to exhibit active inflammation (not inactive). (III) The samples had to come from the IBD (ulcerative colitis (UC) or Crohn's disease (CD)) group and must not have received any manipulations or treatments. Table 1 and Fig. 1 provide details and descriptions of the datasets and tools used.
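
The dataset-level filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Series` record, its field names, and the example entries are hypothetical (they are not real GEO metadata), and the year bounds are assumed to be inclusive.

```python
from dataclasses import dataclass


@dataclass
class Series:
    """Hypothetical summary of one GEO series (field names are assumptions)."""
    accession: str
    technology: str        # e.g. "microarray" or "rna-seq"
    n_samples: int
    n_genes: int
    year: int
    has_intervention: bool


def passes_criteria(s: Series) -> bool:
    """Apply the dataset-level inclusion criteria stated in the text."""
    return (
        s.technology == "microarray"
        and s.n_samples > 50
        and s.n_genes > 15_000
        and 2017 <= s.year <= 2022   # assumed inclusive bounds
        and not s.has_intervention
    )


# Made-up records for illustration only.
candidates = [
    Series("GSE_A", "microarray", 120, 20000, 2019, False),
    Series("GSE_B", "rna-seq", 200, 25000, 2020, False),     # wrong technology
    Series("GSE_C", "microarray", 30, 20000, 2018, False),   # too few samples
    Series("GSE_D", "microarray", 80, 12000, 2021, False),   # too few genes
    Series("GSE_E", "microarray", 90, 18000, 2016, False),   # outside time window
    Series("GSE_F", "microarray", 75, 22000, 2022, True),    # interventional agent
]

selected = [s.accession for s in candidates if passes_criteria(s)]
```

The sample-level criteria (tissue type, active inflammation, untreated IBD samples) would be applied afterwards to the GSMs of each retained series.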

Protein-protein interaction (PPI) network. The Search Tool for the Retrieval of Interacting Genes (STRING; http://string-db.org) online database (version 11.5) 30 was used to generate the PPI network with the following parameters: a confidence score > 0.7 was applied as the threshold value to assess the interactions of protein pairs, and databases, experiments, and text mining were the active interaction sources.

ncRNA prediction. The gene hubs were identified by combining GO terms, KEGG pathway intersections, and other network features. The hubs included the genes connecting the body's signaling pathways and bacterial infections. miRWalk (http://mirwalk.umm.uni-heidelberg.de), a comprehensive database of predicted and validated miRNA-target interactions, was used to predict relevant miRNAs that can modulate the hub genes, with default parameters: species, human; type of IDs, Gene Symbol; score = 1; position, 3'UTR. The score is calculated with a random-forest-based approach by running the TarPmiR algorithm for miRNA target site prediction and, based on the training data, represents the probability that the interaction "works". The networks between the key genes and the most effective miRNAs were drawn on the specific criteria of each miRNA. In the next step, using the DIANA

siRNA prediction. The study examined the proteins of Staphylococcus aureus and Bordetella pertussis that interact with human proteins to trigger or exacerbate the bacterial infections. The bacterial gene sequences involved in cellular signaling were scanned to predict the nucleotide positions with greater energetic stability for siRNA design using the IntaRNA database (http://rna.informatik.uni-freib

Figure 1. A flowchart representation of the steps involved in the dataset meta-analysis. The differentially expressed genes were identified based on four mRNA datasets. The protein-protein interaction network was generated using the STRING database. The gene enrichment was done in Cytoscape software. The interactions between non-coding RNAs and hub genes were identified using the miRWalk, DIANA, and starBase databases, and finally the ceRNA networks were generated.

Statistical analysis and data-processing. The R software tool and the Affy package 34 were used to normalize and standardize the raw data after they had been retrieved from the GEO database (Supplement 1). In expression analysis, normalization and data processing are crucial procedures because they can affect the results of the meta-analysis. Each of the gene samples (GSM) underwent a log2 transformation. Then, the four datasets were merged, and the ComBat procedure from the SVA package 35 was applied to the merged data to reduce study-specific batch effects (batch correction). Using the LIMMA package 36, Student's t-test was performed, and genes with differential expression levels between the control and IBD patient groups were identified. DEGs were defined as genes with a P-value < 0.01 and |log2FC| > 1.
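
As a minimal sketch of the DEG criteria stated above (P-value < 0.01 and |log2FC| > 1), the Python snippet below computes a log2 fold change from raw intensities and labels a gene as up-regulated, down-regulated, or not significant. It is not the R/LIMMA workflow used in the study: the function names are hypothetical, the P-value is taken as an input from whatever test was run upstream, and the fold change is assumed to be the difference of mean log2 intensities (IBD minus control).

```python
import math


def log2_fold_change(case_values, control_values):
    """Difference of mean log2 intensities (case minus control)."""
    case_log = [math.log2(v) for v in case_values]
    ctrl_log = [math.log2(v) for v in control_values]
    return sum(case_log) / len(case_log) - sum(ctrl_log) / len(ctrl_log)


def classify(log2fc, p_value, p_cut=0.01, fc_cut=1.0):
    """Label a gene using the thresholds given in the text."""
    if p_value < p_cut and abs(log2fc) > fc_cut:
        return "up" if log2fc > 0 else "down"
    return "ns"  # not significant under these cut-offs


# Toy intensities (powers of two so the arithmetic is exact).
fc = log2_fold_change([1024, 4096], [256, 1024])   # (10 + 12)/2 - (8 + 10)/2
label = classify(fc, p_value=0.001)
```

Both thresholds are strict inequalities, so a gene with |log2FC| exactly equal to 1 is not called a DEG in this sketch.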

DEGs in IBD.
After the GSE datasets were downloaded (Table 1), crucial genes were identified on the DEG network. A primary network of DEGs (differentially expressed genes) was created from the STRING server using experiments, text mining, and databases as active interaction sources and an interaction score greater than 0.7. The network was then visualized in Cytoscape software (Supplement 2; 188 nodes, 403 edges). Additional analyses were conducted to identify the more crucial genes in the network by filtering on the 75th percentile of the node scores (logFC of each gene) and edge scores (the sum of the edge scores in the STRING server), which gave a network of 118 nodes and 211 edges (Fig. 4). Sub-networks containing three or fewer nodes were removed, and the final network was used for enrichment analysis.
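
Two of the network-filtering steps described above, the interaction-score cut-off and the removal of sub-networks with three or fewer nodes, can be sketched as follows. This is a simplified illustration, not the Cytoscape procedure used in the study: the percentile-based filtering is left out, the function names are hypothetical, and the node labels `G1` to `G7` and their scores are invented.

```python
from collections import defaultdict


def filter_edges(edges, min_score=0.7):
    """Keep edges whose interaction score exceeds the threshold."""
    return [(a, b, s) for a, b, s in edges if s > min_score]


def connected_components(edges):
    """Return the connected components of an undirected edge list."""
    adjacency = defaultdict(set)
    for a, b, _ in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:                      # iterative depth-first search
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


def drop_small_components(edges, max_small=3):
    """Remove every component with max_small or fewer nodes."""
    keep = set()
    for component in connected_components(edges):
        if len(component) > max_small:
            keep |= component
    return [(a, b, s) for a, b, s in edges if a in keep and b in keep]


# Invented toy network.
edges = [
    ("G1", "G2", 0.90), ("G2", "G3", 0.80), ("G3", "G4", 0.95),
    ("G4", "G1", 0.71), ("G5", "G6", 0.99), ("G6", "G7", 0.60),
]
final_edges = drop_small_components(filter_edges(edges))
final_nodes = {n for a, b, _ in final_edges for n in (a, b)}
```

In this toy case the G6-G7 edge fails the score cut-off, which leaves G5-G6 as a two-node component that is then discarded.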
DEG subnetworks were enriched using GO and signaling pathways. Using the ClueGO and DAVID tools, functional annotation and pathway analyses, including GO (Biological Process, Molecular Function) and KEGG analyses, were carried out to systematically characterize the biological functions of the DEGs identified in IBD vs. control samples (Fig. 5, Supplement 3). According to the gene ontology analysis, the genes with significant differences were enriched in the extracellular matrix structure, regulation of complement activation, chronic inflammatory response, regulation of apoptotic cells, inflammatory response, and response to vitamin. According to the KEGG pathway analysis, the DEGs in IBD were primarily enriched in pertussis, Staphylococcus aureus infection, cytokine-cytokine receptor interaction, IL-17 signaling, and viral protein interaction with cytokine and cytokine receptor (Fig. 6).
Associations of bacterial and human proteins were determined. By considering the KEGG enrichment (Fig. 6) and examining the pathways of bacterial infections in humans, it was found that Bordetella pertussis infection, through association with five human proteins (CXCL5, CXCL6, IL6, IL1B, IL1A), and Staphylococcus aureus infection, through association with two human proteins (ICAM1, SELP), affect inflammatory reactions in the human host (Fig. 7). Regarding the upstream axes of the SELP and ICAM1 proteins, it was found that the infection is initiated by bacterial proteins (ClfB, IsdA, sdrC, sdrD, and SasG) that interact with human proteins (KRT10, FGG), resulting in an infection that stimulates the inflammatory systems in the human host (Fig. 7A). Additionally, the human proteins upstream of CXCL5, CXCL6, IL6, IL1B, and IL1A were evaluated in the process of pertussis infection. The bacterial proteins PtxE, PagP, PtxD, PtxC, PtxA, and PtxB are crucial because they interact with the human proteins MD2, CD14, and TLR4. These key proteins promote the development and spread of infections (Fig. 7B).
ncRNAs were predicted for human genes. Small noncoding RNAs called microRNAs (miRNAs) were selected for the upstream pathway genes FGG, KRT10, MD2, TLR4, and CD14. In the miRNA-gene network, the edges were weighted by the sum of the scores (accessibility, me, and energy) of each miRNA in the miRWalk database (Supplement 4). The network was finally restricted to the miRNAs that were shared among genes (Fig. 8).
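
The edge weighting and the restriction to shared miRNAs described above can be illustrated with a short Python sketch. It rests on several assumptions: the weight is taken as the plain, unscaled sum of the three scores (the text does not state any scaling or sign convention), "shared" is read as targeting at least two genes, and the miRNA names, gene names, and score values are placeholders rather than miRWalk output.

```python
from collections import defaultdict


def edge_weight(accessibility, me, energy):
    """Plain sum of the three per-interaction scores (assumed unscaled)."""
    return accessibility + me + energy


def shared_mirnas(interactions, min_genes=2):
    """Return {miRNA: {gene: weight}} for miRNAs that target >= min_genes genes."""
    by_mirna = defaultdict(dict)
    for mirna, gene, accessibility, me, energy in interactions:
        by_mirna[mirna][gene] = edge_weight(accessibility, me, energy)
    return {m: genes for m, genes in by_mirna.items() if len(genes) >= min_genes}


# Placeholder interactions: (miRNA, gene, accessibility, me, energy).
interactions = [
    ("miR-x", "GENE_A", 0.5, 1.0, -2.5),
    ("miR-x", "GENE_B", 0.25, 0.5, -1.75),
    ("miR-y", "GENE_A", 0.5, 0.5, -3.0),
]
shared = shared_mirnas(interactions)
```

Here only `miR-x` is retained, because it is the only miRNA connected to more than one gene.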
The lncRNAs were determined on the network of interactions between common miRNAs and genes (Supplement 5). In addition, an interaction network was created between the common miRNAs and the most effective circRNAs that regulate their activity (Fig. 9).
siRNAs were predicted for bacterial genes. The bacterial proteins proposed to be involved in IBD are ClfB, IsdA, sdrC, sdrD, and SasG in Staphylococcus aureus and PtxE, PagP, PtxD, PtxC, PtxA, and PtxB in Bordetella pertussis. The complete genomes of Staphylococcus aureus and Bordetella pertussis were taken from the NCBI database. The nucleotide sequences of the genes (ClfB, IsdA, sdrC, sdrD, SasG, PtxE, PagP, PtxD, PtxC, PtxA, and PtxB) involved in these infections in humans, which may stimulate the path leading to inflammatory bowel disease, were scanned to predict the best positions for siRNA design. In the IntaRNA database, the most effective siRNAs on gene function were predicted based on the energy released from the hybridization of siRNA and mRNA (Fig. 10).
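
Selecting siRNA target sites by hybridization energy, as described above, amounts to ranking candidate sites per gene and keeping the most negative (most stable) ones. The sketch below assumes the energies have already been produced by an external tool; it does not reproduce IntaRNA's output format, and the gene labels, coordinates, and energy values are invented for illustration.

```python
def best_sites(predictions, top_n=1):
    """Rank candidate sites per gene by hybridization energy (lowest first).

    predictions: iterable of (gene, start, end, energy_kcal_per_mol) tuples.
    Returns {gene: [(energy, start, end), ...]} with at most top_n sites each.
    """
    by_gene = {}
    for gene, start, end, energy in predictions:
        by_gene.setdefault(gene, []).append((energy, start, end))
    return {gene: sorted(sites)[:top_n] for gene, sites in by_gene.items()}


# Invented candidate sites for two hypothetical genes.
predictions = [
    ("geneA", 10, 30, -12.5),
    ("geneA", 100, 120, -20.0),
    ("geneB", 5, 25, -8.0),
]
top = best_sites(predictions)
```

A more negative energy indicates a more stable siRNA-mRNA duplex, so sorting in ascending order places the preferred site first.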

Discussion
The types of inflammatory bowel disease (IBD), Crohn's disease (CD) and ulcerative colitis (UC), are characterized by an abnormal immunological response 37. The development of IBD involves a complicated interplay between numerous important factors 38. A correct understanding of the relationships between these factors clarifies the pathophysiology of IBD. Potential diagnostic and therapeutic targets for IBD may therefore emerge if its molecular mechanisms are understood using high-throughput technologies, which have been widely employed to study the pathogenic processes of IBD 39,40.
In the current investigation, 342 DEGs, including 193 upregulated and 149 downregulated genes, were identified in the IBD samples. The top 75% module of the network's DEGs was subjected to a GO analysis, and the results showed that the DEGs were particularly enriched in cytokine activity, cytokine receptor binding, inflammation, response to bacteria, response to molecules of bacterial origin, leukocyte migration, response to lipopolysaccharide, and cytokine-mediated signaling pathways. These GO terms have been shown in prior research to be potentially significant events in the etiology of IBD 27,39,41,42.
According to the results of the KEGG pathway analysis, the DEGs were particularly enriched in the signaling pathways for cytokine-cytokine receptor interaction, IL-17, TNF, viral protein interaction, rheumatoid arthritis, lipid and atherosclerosis, chemokine, pertussis, and Staphylococcus aureus infection. Previous studies have shown that the development of inflammatory bowel disease is caused by the interaction of immune system molecules (cytokines, interleukins, etc.), cells (macrophages, lymphocytes (B cells and T cells), etc.), signaling pathways (the TNF signaling pathway, the chemokine signaling pathway, etc.), and other factors (foreign microbes or immune system-stimulating drugs) 43,44. Activation of the immune system causes immune reactions, inflammation of the gastrointestinal system, and the onset of IBD symptoms 45.
According to several studies, Mycobacterium paratuberculosis stimulates the immune system's signaling pathways, which may trigger or aggravate inflammatory bowel disease [46][47][48][49]. Some studies have shown that when people's diets shift towards fast foods with high sugar content, the microbial population of the digestive tract changes and harmful bacterial species (B. vulgatus, B. thetaiotaomicron, invasive E. coli, intestinal Helicobacter) replace the beneficial species (Lactobacillus, nonpathogenic E. coli, and S. boulardii) in the digestive tract. Although previous studies have found traces of bacterial infection by Staphylococcus aureus and pertussis in IBD patients, the pathways and genes involved in the potential induction of bowel inflammation by these infectious agents remain to be identified [11][12][13]39. About 50% of healthy adults have persistent or sporadic Staphylococcus aureus colonization of the skin around their noses, but it has recently been identified as a key early colonizer of the infant gut 52,53. There is little information on the intestinal colonization and fecal transport of methicillin-resistant S. aureus (MRSA) in IBD 54,55.
In this study, the possibility that infections caused by Staphylococcus aureus and pertussis contribute to the onset of inflammatory bowel disease was evaluated. The enrichment analysis identified the most susceptible pathways and genes that could be studied to prevent the inflammation caused by these bacteria.
According to previous research, KRT10 (keratin type I cytoskeletal protein 10) increases in people with inflammatory bowel disease and physically alters and rearranges the structure of the epithelial cells of the gastrointestinal tract. This alteration affects the permeability of the epithelial layer and the cells' ability to self-renew. These conditions make it possible for pathogens to enter the body's interior layers and induce bacterial infection, as seen in IBD 56,57. Some studies have suggested that FGG (fibrinogen gamma chain) acts as an early mediator in the cross-talk between coagulation and inflammation and can stimulate immune system reactions by interacting with bacterial agents and immune system cells 58.
The ability of Staphylococcus aureus to adhere to squamous epithelial cells was consistently shown to be influenced by ClfB (clumping factor B), IsdA (iron-regulated surface determinant protein A), SdrC (serine-aspartate repeat-containing protein C), and SdrD (serine-aspartate repeat-containing protein D). A mutant lacking all four proteins exhibited a total adhesion defect. A vaccine against two or more of these surface proteins could significantly reduce carriage as well as the likelihood of infection and spread [59][60][61][62]. It has been demonstrated that the Staphylococcus aureus surface proteins SasG (surface protein G) and Pls (surface protein of certain methicillin-resistant Staphylococcus aureus strains), which were previously discovered through in vitro studies of genome sequences, also contribute to bacterial adhesion to epithelial cells; in their presence, the adhesion strength of Staphylococcus aureus to epithelial cells increases 63.
The pertussis lipopolysaccharides (LPS) and lipooligosaccharides (LOS) are ligands for the receptor complexes that include Toll-like receptor 4 (TLR4), MD-2, and CD14. Inflammatory and immunological defense responses are triggered by stimulation of these receptor complexes, which activate signaling pathways such as NF-κB and cause the expression of antimicrobial genes and the release of cytokines, which can lead to inflammation 64. MD-2 is a small secreted glycoprotein that binds to the hydrophobic region of LPS and the extracellular domain of TLR4. TLR4 is activated by the interaction between MD-2 and LPS 65. After LPS stimulation, CD14 inhibits the rise in barrier permeability and leads to activation of the NF-κB signaling pathway 66.
Pertussis toxin (PTX) is a typical A-B toxin. ADP-ribosyl transferase activity resides in the A-subunit (PTXA or S1 subunit). The B-subunit, which is made up of the S2-S5 subunits (PTXB, PTXC, PTXD, and PTXE), binds cell surface molecules to enable the toxin to enter the cells. PTXA ADP-ribosylates the α subunits of heterotrimeric Gi/o proteins, preventing the Gαi/o proteins from inhibiting adenylyl cyclase (AC). One of the ways in which PTX causes its many pathogenic consequences in host cells is this alteration of the Gi/o proteins, which leads to an increased buildup of cAMP. An intracellular signal transduction cascade is activated when the B-subunit interacts with Toll-like receptor 4 67,68. PagP (lipid IVA palmitoyl transferase) transfers a palmitate residue from the sn-1 position of a phospholipid to the N-linked hydroxymyristate on the proximal unit of lipid A. PagP is an outer membrane enzyme involved in the production of lipid A 69.
Non-coding RNAs, including miRNAs, circRNAs, and lncRNAs, have been shown to play intricate roles in the development of IBD 15. By binding to miRNAs in a competitive manner, circRNAs and lncRNAs can control the expression of genes. A growing body of research points to roles for the circRNA-miRNA-mRNA axis or the lncRNA-miRNA-mRNA axis in the development of disorders [70][71][72]. For example, increased expression of miR-146b and its effect on the Siah2 gene make the NF-κB pathway more active and increase intestinal inflammation 73. An increase in miR-126 has the same effect by acting on the IκBα gene 19. CircRNA_103516 in PBMCs has been reported as an excellent potential biomarker for the diagnosis of IBD. Through hsa-miR-19b-1-5p sponging, dysregulation of circRNA_103516 may contribute to the molecular basis of IBD 74. The discovery of lncRNA IFNG-AS1 as a new regulator of IFNG inflammatory responses and the deregulation of lncRNA signatures in UC suggest the potential significance of noncoding RNA pathways in the control of inflammatory bowel disease 75.
In this study, five important human genes (KRT10, FGG, TLR4, CD14, and MD2) were identified in the development and spread of pertussis and Staphylococcus aureus infections. Using the miRWalk database, we created an interaction network between some of these genes (KRT10, FGG, TLR4) and their common miRNAs. In addition, the results showed that the NM_001039703 regions with miR-186-5p, miR-4436a, and miR-520a-5p and the NM_006267 regions with miR-186-5p, miR-6763-5p, and miR-520a-5p are hotspots that produce circular RNAs on the
The ceRNA network including miRNA-lncRNA-mRNA showed that the lncRNAs XIST and NEAT1 have the most important role among the lncRNAs in the network by affecting five common miRNAs related to human genes involved in Staphylococcus aureus infection and pertussis. Some studies have investigated the effects of lncRNA XIST on the immune response. By modifying the expression of ASF1A and BRWD1, lncRNA XIST can act as a ceRNA that sponges hsa-miR-212-3p to control inflammation and apoptosis in the development of AKI (acute kidney injury) 70. The expression of proinflammatory cytokines induced by E. coli or S. aureus was considerably boosted by XIST silencing. Additionally, the NF-κB phosphorylation caused by E. coli or S. aureus and the formation of the NLRP3 inflammasome were suppressed by XIST. Although XIST function remains somewhat controversial, these findings imply that XIST inhibits the NF-κB pathway activation triggered by E. coli or S. aureus 76. Another study suggested that the stimulatory action of XIST was related to NF-κB: OSAHS (obstructive sleep apnea/hypopnea syndrome) patients had inflammation in their adenoids that was mediated by the XIST-GRα-NF-κB signaling pathway 77.
The lncRNA NEAT1 is associated with the innate immune response. This lncRNA appears to have a function in triggering the immune response by activating the kappa pathway 15. NEAT1 was highly expressed in IBD and played a role in the inflammatory response by controlling intestinal epithelial permeability and macrophage polarization via exosomes 78. According to some studies, TNFRSF1B may contribute to colitis by promoting inflammation. It was discovered that NEAT1 exerts a pro-inflammatory effect by controlling TNFRSF1B (tumor necrosis factor receptor superfamily member 1B) and facilitates the entry of NF-κB p65 into the nucleus 79. Studies have suggested that the Neat1-miRNA204-5p axis plays an important role in the regulation of the PI3K-AKT pathway, which in turn helps activate the pathways that lead to greater activity of the immune system 80. An increase in NEAT1 has been observed in various cancers, liver diseases, non-alcoholic fatty liver disease (NAFLD), and liver fibrosis 81. Although most studies show an increase in NEAT1 in IBD, more experimental tests are needed to determine the exact role of lncRNA NEAT1. The other predicted lncRNAs (SNHG3, SNHG7, NORAD, AC023509.1, AC016876.2, and AC021078.1) are also good subjects for further experimental tests to determine their exact role in bacterial infection and inflammation.
Because inhibiting bacterial genes in Staphylococcus aureus and pertussis bacterial infections can prevent the infection process, siRNAs that silence these genes may be effective against these infections, and we were able to identify the most stable and sensitive sites for siRNA action. The siRNA was functional in lowering the bacterial burden in a mouse model of hematogenous 82. It was shown that rectal administration of siRNA targeting TNF-α resulted in relative mucosal resistance to experimental colitis in a mouse model of inflammatory bowel disease 83. Nanoparticles with surface CD98 antibody and loaded with CD98 siRNA (siCD98) suppressed the production of this protein in colonic epithelial cells and macrophages 84.

Figure 8. The network of miRNAs and human genes involved in the Staphylococcus aureus and Bordetella pertussis bacterial infections. The pertussis and Staphylococcus aureus pathways transduce the signal via CD14, TLR4, KRT10, and FGG. The edge colors (from green to red) indicate increasing confidence in the connections between nodes. The important miRNAs predicted to affect KRT10 and TLR4 were hsa-miR-1229-5p, hsa-miR-887-3p, and hsa-miR-1185-1-3p. The important miRNA predicted to target CD14 and TLR4 was hsa-miR-4515. The important miRNAs predicted to affect FGG and TLR4 were hsa-miR-4462, hsa-miR-584-5p, and hsa-miR-1207-5p.

In conclusion, the gene hubs obtained from the cross talk between human and bacterial pathways were used to find the ncRNA profiles. These regulatory sets can simultaneously affect the human and bacterial gene groups and may be effective therapeutic agents in IBD. Considering that the human KRT10, FGG, and TLR4 genes play an important role in infection by these bacteria, the ceRNA networks suggest that the lncRNAs XIST and NEAT1 play an important role in regulating the function of hsa-miR-6875-5p, hsa-miR-1908-5p, hsa-miR-186-5p, hsa-miR-6763-5p, hsa-miR-4436a, and hsa-miR-520a-5p, which may regulate the activity of genes involved in the occurrence of Staphylococcus aureus and pertussis bacterial infections. The production of circRNAs at the NM_001039703 and NM_006267 positions affects miR-186-5p, miR-4436a, miR-520a-5p, and miR-6763-5p, which act on the functions of the KRT10, FGG, and TLR4 genes. Further experimental studies can determine the precise roles of these ncRNAs and genes in the process that causes inflammatory bowel disease. We also predicted the most effective cores for siRNAs targeting the Staphylococcus aureus (ClfB, IsdA, sdrC, sdrD, and SasG) and pertussis (PtxE, PagP, PtxD, PtxC, PtxA, and PtxB) bacterial genes whose products interact with human KRT10, FGG, and TLR4. These genes are also known in other bacteria. However, the putative genes and ncRNAs proposed here for IBD should be confirmed by additional research and experiments with larger datasets.

Figure 9.
Communication network between human genes involved in bacterial infections (green nodes), miRNAs (pink nodes), and circRNAs. The edges between miRNAs and circRNAs (yellow to red) reflect the increasing number of circRNAs from that chromosomal region that affect the miRNAs. The colors of the circRNA nodes (from green to red) are based on the experimental and laboratory evidence for the presence of circRNAs in that chromosomal region. The chromosomal regions including NM_001039703 and NM_006267 had the greatest effects on the interaction network between miRNAs and human genes involved in Staphylococcus aureus and pertussis bacterial infections.

Data availability
The datasets (GSE126124, GSE87473, GSE75214, and GSE95095) used during the current study are available from NCBI GEO (https://www.ncbi.nlm.nih.gov/geo/). Other data analyzed in the article can be provided by the corresponding author on reasonable request.